Clustering of correlated networks 
We obtain the clustering coefficient, the degree-dependent local clustering, and the mean
clustering of networks with arbitrary correlations between the degrees of the nearest-neighbor vertices.
The resulting formulas allow one to determine the nature of the clustering of a network. 

In principle, loops, and, in particular, loops of length
three which lead to the clustering of networks, are a 
specific kind of correlations. Usually, real-world net- 
works are strongly clustered structures, and many ef- 
forts were made to invent special mechanisms producing 
strong clustering even in small nets [1,2]. The number 
of the proposed mechanisms is rapidly growing, but the 
recent development of the field [3-6] shows that in very 
many real networks the high clustering is only a finite- 
size effect. So, in this case, no additional mechanism of 
strong clustering is needed. The problem is to reliably 
conclude whether or not the clustering of a real network 
is a finite-size effect which can be explained by using basic 
random graph constructions [7]. Evidently, comparison 
with results obtained in the framework of specific mod- 
els with many adjusting parameters cannot lead to any 
convincing conclusion. 

Another basic, though particular, kind of correlation
in networks is the correlation between the numbers of
connections (degrees) of the nearest-neighbor vertices [8-20].
Networks with these specific correlations are being exten- 
sively studied these days, and the term "correlated net- 
works" often implies just this type of correlations. These 
pair correlations were measured in a number of real
networks [9-14,19], so the joint distribution of the degrees
of the nearest-neighbor vertices, $P(k,k')$, is considered
one of the metrics of a network. Note that as a rule, these
correlations do not vanish in the large network limit. 

The classical random graphs [21,22] with their Poisson
degree distribution provide an inadequate picture
of a real complex network and a very weak clustering,
$C \approx \bar{k}/N$. Here $\bar{k}$ is the mean degree of a graph and
$N$ is its size (the total number of vertices). Random
graphs with a given degree distribution $P(k)$ (the
configuration model of mathematical graph theory [23]) are
much closer to real complex networks. It is the values of
the clustering coefficient of this model, $C \propto N^{-1}$ [7,24],
that were compared with empirical data for real-world
networks.

The configuration model and its variations provide 
(uncorrelated) random graphs which are maximally random
(i.e., with the maximum entropy) under the constraint
that their degree distribution is equal to a given
one, $P(k)$. These graphs are closer to reality than the
classical random graphs, but the absence of correlations
is a very restrictive factor. If we wish to make a step 
toward real networks, we have to introduce a network 
with degree-degree correlations, $P(k,k')$. The simplest
formal way to do this is in the spirit of the configuration
model. That is, consider random graphs which are maximally
random under the constraint that their degree-degree
distribution is equal to a given one, $P(k,k')$. This
is the minimal construction of a random graph with these 
correlations. In this construction, as the size of a graph 
approaches infinity, loops become insignificant, and the 
clustering vanishes [25]. 

In the present communication we obtain analytical ex- 
pressions for the complete list of the clustering character- 
istics of the random graphs with these important degree- 
degree correlations, see Eqs. (8)-(10). These formulas, 
after the substitution of a measured distribution $P(k,k')$,
allow one to conclude whether or not the clustering of a 
real-world or generated network is simply a finite-size ef- 
fect, the same as in a maximally random graph with this 
degree-degree distribution. Furthermore, the resulting 
clustering characteristics are qualitatively different from 
those of uncorrelated networks. 

The graphs in this communication are completely described
by the joint distribution $P(k,k')$ of the degrees of
the end vertices of an edge of the graph, $\sum_{k,k'} P(k,k') = 1$,
$P(k,k') = P(k',k)$. The degree distribution $P(k)$ is
determined by $P(k,k')$:

$$
P(k) = \frac{\bar{k}}{k} \sum_{k'} P(k,k'), \tag{1}
$$

where the mean degree $\bar{k} = \langle k \rangle = \sum_k k P(k)$ is

$$
\bar{k} = \left[ \sum_{k,k'} \frac{P(k,k')}{k} \right]^{-1}. \tag{2}
$$

In the following, we assume that the total number of vertices
of the graph, $N$, is large and consider only the main
contribution to the clustering.

$P(k,k')$ can be obtained from empirical data as follows.
If $k \neq k'$, $P(k,k') = P(k',k)$ is one half of the ratio
of the number of edges connecting vertices of degrees $k$
and $k'$ to the total number of edges, $L = \bar{k}N/2$. If
$k = k'$, $P(k,k)$ is the ratio of the number of edges
connecting two vertices of degree $k$ to $L$.
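
The counting rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `joint_degree_distribution` and the edge-list input format are hypothetical, and the graph is assumed to be simple and undirected (each edge listed once, no self-loops or multiple edges).

```python
from collections import Counter, defaultdict


def joint_degree_distribution(edges):
    """Estimate P(k, k') from a list of undirected edges (u, v).

    Hypothetical helper: assumes a simple graph with every edge listed once.
    Returns (P, degree), where P maps (k, k') to a probability and
    degree maps each vertex to its degree.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    L = len(edges)  # total number of edges
    P = defaultdict(float)
    for u, v in edges:
        k, kp = degree[u], degree[v]
        if k == kp:
            # edges between equal degrees contribute their full share to P(k, k)
            P[(k, k)] += 1.0 / L
        else:
            # otherwise the share is split between P(k, k') and P(k', k)
            P[(k, kp)] += 0.5 / L
            P[(kp, k)] += 0.5 / L
    return dict(P), dict(degree)
```

By construction the returned values are symmetric in their two arguments and sum to one, as required of $P(k,k')$.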

The set of clustering characteristics of networks, con- 
sidered up to now, includes: 

(i) The degree-dependent local clustering $C(k)$. This is
the mean relative number of connections (at most 1)
between the nearest neighbors of a vertex of degree $k$:

$$
C(k) = \frac{\langle m_{nn}(k) \rangle}{k(k-1)/2}, \tag{3}
$$

where $\langle m_{nn}(k) \rangle$ is the average number of connections
between the nearest neighbors of a vertex of degree $k$.

(ii) The mean clustering (mean clustering coefficient),
which is defined as

$$
\bar{C} = \sum_k P(k)\, C(k). \tag{4}
$$



(iii) The clustering coefficient, which is defined as

$$
C = \frac{\sum_k P(k)\, \langle m_{nn}(k) \rangle}{\sum_k P(k)\, k(k-1)/2}. \tag{5}
$$

This coincides with the traditional definition: the clustering
coefficient is three times the ratio of the total number
of loops of length three in a graph to the total number
of connected vertex triples. In simple terms, this is the
"concentration" of loops of length three.

We shall obtain the clustering characteristics $C(k)$, $\bar{C}$,
and $C$ of correlated graphs, but let us first introduce the
conditional probability $P(k|k')$ that if one end vertex of
an edge is of degree $k'$, then its other end vertex is of
degree $k$:

$$
P(k|k') = \frac{P(k,k')}{\sum_{k} P(k,k')} = \frac{\bar{k}\, P(k,k')}{k' P(k')}. \tag{6}
$$

Then the local clustering, that is, the probability that
two nearest neighbors of a vertex of degree $k > 1$ are
connected, is

$$
C(k) = \sum_{q,q'>1} P(q'|k)\, P(q|k) \cdot P(q'|q)\, \frac{q'-1}{N q' P(q')} \cdot (q-1). \tag{7}
$$

One can easily understand this formula:

(i) The first two factors $P(q'|k) P(q|k)$ on the right-hand
side, which should be accounted for before the summation
over $q$ and $q'$, are evident: these are the probabilities
that the vertices are of degrees $q$ and $q'$.

(ii) In fact, we must calculate the probability that the
nearest neighbors with degrees $q$ and $q'$ of a vertex of
degree $k$ are connected to each other. We have two vertices
with $q-1$ and $q'-1$ "free connections" (apart from the
connections to the mother vertex). Let us select one of the
"free connections" of the $q$-vertex. The probability that
this edge will "choose" one of the $q'-1$ "free connections"
of the $q'$-vertex is given by the product between the two
central dots on the right-hand side of the formula. The
factor $P(q'|q)$ is evident: the second end of the edge must
be of degree $q'$. So our edge must "choose" ("grasp") one
of the $q'-1$ "free connections" of the $q'$-vertex among
almost $N q' P(q')$ possibilities in the network. (All these
possibilities are equiprobable in the construction which is
considered here.) This is the total number of "free
connections" provided by the $N P(q')$ vertices of degree $q'$ in the
network. This gives $(q'-1)/[N q' P(q')]$.

(iii) Finally, we must multiply this probability by the
number $q-1$ of the "free connections" of the $q$-vertex.

The result is Eq. (7). Note that we used the fact that
$N$ is large and the probability that the edge between the
nearest neighbors is present is small, so our formulas are
asymptotic. Substituting Eq. (6) into Eq. (7) gives the
degree-dependent local clustering



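$$
C(k) = \frac{\bar{k}^3}{N k^2 P^2(k)} \sum_{q,q'>1} \frac{(q'-1)(q-1)\, P(q',q)\, P(q',k)\, P(q,k)}{q q' P(q) P(q')}, \tag{8}
$$

the mean clustering

$$
\bar{C} = \frac{\bar{k}^3}{N} \sum_{k,q,q'>1} \frac{(q'-1)(q-1)\, P(q',q)\, P(q',k)\, P(q,k)}{k^2 q q' P(q) P(q') P(k)}, \tag{9}
$$

and the clustering coefficient

$$
C = \frac{\bar{k}^3}{N(\langle k^2 \rangle - \bar{k})} \sum_{k,q,q'>1} \frac{(k-1)(q'-1)(q-1)\, P(q',q)\, P(q',k)\, P(q,k)}{k q q' P(q) P(q') P(k)} \tag{10}
$$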



of the correlated network with given correlations $P(k,k')$.
The degree distribution $P(k)$ in these formulas may be
expressed in terms of $P(k,k')$ by using the relations (1)
and (2). The results (8)-(10) may be written in a more
compact form in terms of conditional probabilities, see
Eq. (6), but the present form is more convenient for
empirical researchers.
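
For readers who want to evaluate Eqs. (8)-(10) numerically, the sums can be sketched as follows. This is a minimal illustration rather than a reference implementation: the function names, the dictionary input keyed by degree pairs `(k, k')`, and the choice to restrict all sums to degrees larger than 1 are assumptions made here. The network size `N` must be supplied separately, and $P(k)$ and $\bar{k}$ are recovered from $P(k,k')$ through the relations (1) and (2).

```python
from collections import defaultdict


def degree_distribution(P_joint):
    """Recover P(k) and the mean degree from P(k, k') via relations (1)-(2)."""
    row = defaultdict(float)
    for (k, _kp), p in P_joint.items():
        row[k] += p  # sum over k' of P(k, k') equals k P(k) / kbar
    kbar = 1.0 / sum(s / k for k, s in row.items())  # relation (2)
    P = {k: kbar * s / k for k, s in row.items()}    # relation (1)
    return P, kbar


def clustering_from_correlations(P_joint, N):
    """Evaluate Eqs. (8)-(10) for a given P(k, k') and network size N.

    Returns (C_k, C_mean, C): a dict of C(k) for k > 1, the mean
    clustering, and the clustering coefficient.
    """
    P, kbar = degree_distribution(P_joint)
    # Degree-1 vertices have no pairs of neighbors, and the factors
    # (q - 1), (q' - 1) vanish at degree 1, so those degrees are skipped.
    degrees = [k for k in sorted(P) if k > 1]

    def pj(a, b):
        return P_joint.get((a, b), 0.0)

    C_k = {}
    for k in degrees:
        total = 0.0
        for q in degrees:
            for qp in degrees:
                total += ((qp - 1) * (q - 1)
                          * pj(qp, q) * pj(qp, k) * pj(q, k)
                          / (q * qp * P[q] * P[qp]))
        C_k[k] = kbar ** 3 * total / (N * k ** 2 * P[k] ** 2)  # Eq. (8)

    C_mean = sum(P[k] * C_k[k] for k in degrees)               # Eq. (9)
    k2 = sum(k * k * pk for k, pk in P.items())                # <k^2>
    denom = k2 - kbar
    weighted = sum(P[k] * k * (k - 1) * C_k[k] for k in degrees)
    C = weighted / denom if denom > 0 else 0.0                 # Eq. (10)
    return C_k, C_mean, C
```

A simple sanity check is to pass a factorized distribution, $P(k,k') \propto kP(k)\,k'P(k')$, with no vertices of degree 1: the three returned characteristics should then coincide up to rounding. The triple loop costs a number of operations cubic in the number of distinct degrees, which is usually small compared with the network size.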

In uncorrelated networks, $P(k,k') = kP(k)\, k'P(k')/\bar{k}^2$
and the probability that the nearest neighbor of a vertex
is of degree $k$ is $kP(k)/\bar{k}$. In this case, Eqs. (8)-(10)
reduce to the known result [7,24]

$$
C(k) = \bar{C} = C = \frac{(\langle k^2 \rangle - \bar{k})^2}{N \bar{k}^3}. \tag{11}
$$

The formulas (8)-(11) are asymptotically exact.
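
The reduction to the uncorrelated result can be traced explicitly from Eq. (7). The following worked steps are a sketch that assumes only the factorized form of $P(k,k')$ stated above, so that the conditional probability $P(q|k) = qP(q)/\bar{k}$ does not depend on $k$:

```latex
% Uncorrelated case: P(q|k) = q P(q) / \bar{k}, independent of k.
\begin{align*}
C(k) &= \sum_{q,q'} \frac{q' P(q')}{\bar{k}} \, \frac{q P(q)}{\bar{k}} \cdot
        \frac{q' P(q')}{\bar{k}} \, \frac{q'-1}{N q' P(q')} \cdot (q-1) \\
     &= \frac{1}{N \bar{k}^3} \sum_{q'} q'(q'-1) P(q') \, \sum_{q} q(q-1) P(q) \\
     &= \frac{(\langle k^2 \rangle - \bar{k})^2}{N \bar{k}^3}.
\end{align*}
```

The right-hand side contains no $k$, which is why the degree-dependent local clustering is constant in this case.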

Note that in uncorrelated networks, $C(k)$ is independent
of $k$ and so all three characteristics are equal.
In contrast, degree-degree correlations lead to a
degree-dependent local clustering [see Eq. (8)]. Previously, this
feature was observed in a number of model and real
networks [26-31]. Here we demonstrate that this dependence
is a direct consequence of degree-degree correlations.
The degree-dependent local clustering leads to
the difference between $\bar{C}$ and $C$, which was found in
many real-world networks [28-30].

One should note that the formula (8) for the degree-dependent
local clustering resembles the expression (61) for
the local clustering of a correlated network with hidden
variables in the recent paper of Marián Boguñá and
Romualdo Pastor-Satorras, Ref. [32]. However, there is an
essential difference between these two results. The result
of Ref. [32] is $C(k)$, expressed in terms of the correlations
of hidden variables ("fitnesses") which were used to
generate a correlated network. It is impossible to find the
exact form of these hidden variable correlations from
empirical data. In contrast, Eq. (8) in the present work is
obtained for a random network, which is completely
described by $P(k,k')$, and expresses $C(k)$ directly in terms
of the observable degree-degree distribution $P(k,k')$. It
is the latter circumstance that allows one to use Eqs.
(8)-(10) for the structural analysis of networks.

The number of edges connecting vertices of degrees $k$
and $k'$ can be easily measured in any real-world or generated
network [11-13,19]. Substituting these numbers,
together with the numbers of vertices of degree $k$, into Eqs.
(8)-(10) provides the clustering characteristics
of a maximally random graph with the same degree-degree
correlations as the real network. These clustering
characteristics may be compared with those of the real
net. If the results are close enough, then the clustering of
the net is explained by the basic correlated random graph
construction and so is a simple finite-size effect. Only
if the calculated characteristics differ strongly from the
measured ones does the clustering have a non-trivial nature.
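
The measured side of this comparison, that is, the characteristics defined in Eqs. (3)-(5), can be obtained directly from an edge list. The following is a minimal sketch under stated assumptions: the function name `measured_clustering` is hypothetical, the graph is assumed simple and undirected, and isolated vertices (which do not appear in an edge list) are not counted in $N$.

```python
from collections import defaultdict
from itertools import combinations


def measured_clustering(edges):
    """Measure C(k), the mean clustering, and the clustering coefficient.

    Hypothetical helper for a simple undirected graph given as (u, v) pairs.
    Vertices of degree 1 are taken to contribute zero local clustering.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    N = len(adj)
    links = defaultdict(int)  # total neighbor-neighbor links at degree k
    count = defaultdict(int)  # number of vertices of degree k
    for v, nbrs in adj.items():
        k = len(nbrs)
        count[k] += 1
        links[k] += sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])

    # Eq. (3): average number of neighbor-neighbor links over k(k-1)/2
    C_k = {k: links[k] / (count[k] * k * (k - 1) / 2) for k in count if k > 1}
    # Eq. (4): average of C(k) over the degree distribution P(k) = count/N
    C_mean = sum(count[k] / N * C_k[k] for k in C_k)
    # Eq. (5): total neighbor-neighbor links over total connected triples
    triples = sum(count[k] * k * (k - 1) / 2 for k in count)
    C = sum(links.values()) / triples if triples else 0.0
    return C_k, C_mean, C
```

For large networks the pairwise neighbor check becomes expensive at high-degree vertices, so this sketch favors clarity over speed.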

Note that in sparse networks, measured degree-degree
distributions strongly fluctuate due to poor statistics.
This factor cannot spoil the results (8)-(10), since even
strong fluctuations are summed out.

One should indicate two restrictions. (i) The formulas
(8)-(11) are asymptotic (large $N$, sufficiently "weak"
clustering). So, one may hope that they are good if $C$ is
less than, say, 0.1, but only qualitative comparison is
possible if, e.g., $C \sim 0.3$. (ii) The growth of real-world
networks produces a wide spectrum of correlations, and the
correlations between the degrees of the nearest-neighbor
vertices are only one specific type of correlation. The
construction that is considered in this communication
ignores the long-range and multi-vertex correlations.
Empirical data on such correlations are absent.

In summary, we obtained the clustering characteris- 
tics of networks with correlations between degrees of 
the nearest-neighbor vertices. These correlations are a 
common feature of real networks. Our formulas allow 
one to easily conclude whether or not the clustering of a 
network is determined by the form of its degree-degree 
distribution and so is a simple finite-size effect. So, Eqs. 
(8)-(10) can shed light on the nature of the clustering of
networks. We hope that these simple expressions will be
a useful tool for the analysis of real-world and generated
networks.